fix(lookup_error): weak fuzzy hits no longer suppress semantic fallback #20

Merged
critesjosh merged 3 commits into main from feat/lookup-error-score-threshold
May 2, 2026
@critesjosh
Collaborator

Summary

Reported in the v1.20.0 dogfood test: aztec_lookup_error("note already nullified") returned the unrelated catalog entry "Contract already initialized" with a Jaccard word-overlap score of 54, and the semantic-documentation fallback never fired. The early-return in lookupAztecError treated any totalMatches > 0 as authoritative regardless of confidence, so a noise-floor fuzzy match shadowed the actually-useful answer.
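To see why this pair counts as a noise-floor match, here is a minimal sketch of plain Jaccard word overlap. The `tokenize` and `jaccard` helpers are hypothetical, and the mapping from raw overlap to the catalog's 50-65 score band is not shown in this PR, so the sketch stops at the raw ratio.

```typescript
// Hypothetical helper: split a message into a set of lowercase word tokens.
function tokenize(text: string): Set<string> {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
  return new Set(words);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range [0, 1].
function jaccard(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((token) => {
    if (setB.has(token)) shared += 1;
  });
  return shared / (setA.size + setB.size - shared);
}

// The two messages share only the word "already": 1 of 5 distinct words.
const overlap = jaccard("note already nullified", "Contract already initialized");
console.log(overlap);
```

A single shared filler word is enough to produce a catalog hit, which is why a hit of this kind should not be treated as authoritative.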

Fix

Introduce STRONG_MATCH_THRESHOLD = 70. Catalog hits below the threshold no longer short-circuit semantic fallback — they're kept in the result so the formatter can still render them as low-confidence cues, but the tool now falls through to the semantic path.

The threshold aligns with the existing score system in src/utils/error-lookup.ts:

| Score | Match type | Effect |
| --- | --- | --- |
| 100 | exact-code / hex-signature | strong → short-circuits |
| 95 | exact-pattern | strong → short-circuits |
| 70-80 | substring | strong → short-circuits (boundary at 70) |
| 50-65 | word-overlap (Jaccard) | weak → falls through to semantic |

A codeMatch (ripgrep over cloned source) by itself still short-circuits — those are direct grep hits with no fuzziness.

When weak hints exist alongside semantic results, the message field names the situation explicitly ("No strong static match — N low-confidence fuzzy hint(s) shown below. Showing relevant documentation.") rather than pretending nothing matched.
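The gating rule described in this section can be sketched roughly as follows. This is an illustration rather than the code in `lookupAztecError`: the `CatalogMatch` and `StaticLookup` shapes, the `codeMatchCount` field, and both function names are assumptions. Only `STRONG_MATCH_THRESHOLD = 70` and the short-circuit rules come from the description above.

```typescript
// Assumed shapes for illustration; the real types are not shown in this PR text.
interface CatalogMatch {
  name: string;
  score: number; // 0-100, following the score table above
}

interface StaticLookup {
  catalogMatches: CatalogMatch[];
  codeMatchCount: number; // hypothetical: number of ripgrep hits in cloned source
}

const STRONG_MATCH_THRESHOLD = 70;

// True when the static result is confident enough to skip the semantic fallback.
function shouldShortCircuit(lookup: StaticLookup): boolean {
  const hasStrongCatalogHit = lookup.catalogMatches.some(
    (match) => match.score >= STRONG_MATCH_THRESHOLD,
  );
  // A code match alone still short-circuits: it is a direct grep hit.
  return hasStrongCatalogHit || lookup.codeMatchCount > 0;
}

// Weak hints are kept for rendering rather than discarded.
function weakHints(lookup: StaticLookup): CatalogMatch[] {
  return lookup.catalogMatches.filter((match) => match.score < STRONG_MATCH_THRESHOLD);
}

const regression: StaticLookup = {
  catalogMatches: [{ name: "Contract already initialized", score: 54 }],
  codeMatchCount: 0,
};
console.log(shouldShortCircuit(regression), weakHints(regression).length);
```

Using `>=` puts a score of exactly 70 on the strong side, matching the boundary noted in the table.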

Tests added (6)

  • threshold boundary: score === 70 still short-circuits
  • codeMatch alone (no catalog) still short-circuits
  • regression: "note already nullified" with score-54 word-overlap → semanticHealth: "ok", semanticResults populated, weak hint preserved in result.catalogMatches, message contains "low-confidence"
  • multiple weak hints (max score 65) still fall through
  • weak hints + no client → "skipped", message names the situation and points at API_KEY
  • weak hints + semantic returning empty → "no_results", message acknowledges both signals

tsc --noEmit clean. All 247 tests pass (was 241).

Test plan

  • npm run build (tsc)
  • npx vitest run
  • After release: re-run aztec_lookup_error("note already nullified") against v1.21+ and confirm semantic fallback returns documentation about note lifecycle / nullification.

🤖 Generated with Claude Code

critesjosh and others added 3 commits May 2, 2026 01:00
Reported in v1.20.0 dogfood: `aztec_lookup_error("note already nullified")`
returned the unrelated catalog entry "Contract already initialized" with
a Jaccard word-overlap score of 54, and the semantic-documentation
fallback never fired because the early-return treated any catalog hit
— regardless of confidence — as authoritative.

Fix: introduce STRONG_MATCH_THRESHOLD = 70. Below that, the catalog
hits are kept in the result (so the formatter can still render them as
low-confidence cues), but the tool falls through to the semantic
fallback. The threshold aligns with the score system in
utils/error-lookup.ts:

  100  exact-code / hex-signature  → strong, always short-circuits
   95  exact-pattern               → strong
   70-80 substring                 → strong (boundary at 70)
   50-65 word-overlap (Jaccard)    → weak, falls through

When weak hints exist alongside the semantic results, the message
field now names the situation ("No strong static match — N
low-confidence fuzzy hint(s) shown below. Showing relevant
documentation.") instead of pretending nothing matched.

Tests added (6):
- threshold boundary: score === 70 still short-circuits
- codeMatch alone (no catalog) still short-circuits
- regression: "note already nullified" with score-54 word-overlap →
  semanticHealth='ok', semanticResults populated, weak hint preserved
  in result.catalogMatches, message contains "low-confidence"
- multiple weak hints (max score 65) still fall through
- weak hints + no client → "skipped", message names the weak-hint
  situation and points at API_KEY
- weak hints + semantic returning empty → "no_results", message
  acknowledges both signals

All 247 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Formatter: when semantic results exist alongside weak-only catalog
  hits, render Documentation FIRST, then "## Lower-Confidence Catalog
  Hints" with an italicized note that the docs above are likely more
  authoritative. Prevents the LLM consumer from anchoring on a
  misleading top hit (the original "Contract already initialized"
  failure mode rendered the bogus entry under "## Known Errors" with
  full **bold name** + cause/fix before the actually-relevant docs).

- category filter preserved: when the caller passes `category` and
  the catalog produced any in-category match (even a weak one), keep
  the pre-PR short-circuit. Falling through to a category-agnostic
  semantic search would surface out-of-scope docs and confuse a user
  who explicitly narrowed the request.

- API_KEY guidance reworded from "An API_KEY would enable..." to
  "Set API_KEY... (get a free key by running /mcp-key in the
  Aztec/Noir Discord: https://discord.gg/xMud5StFyA)" — actionable
  next-step phrasing matches the rest of the project's wording.

- 3 more tests covering the codex coverage gaps:
  * weak-only + version-mismatch: gate blocks semantic, weak hint
    preserved, message names both signals.
  * weak-only + allowVersionMismatch=true: gate skipped, semantic
    runs.
  * category filter + weak-only: short-circuits (does NOT fall
    through to category-agnostic semantic).
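
A rough sketch of the category rule described above, under stated assumptions: the `CategorizedMatch` shape and the function name are hypothetical, and the behaviour when a category is passed but nothing in that category matched is a guess, since the commit message does not cover it.

```typescript
// Assumed shape for illustration only.
interface CategorizedMatch {
  category: string;
  score: number;
}

const STRONG_MATCH_THRESHOLD = 70;

// Decide whether catalog hits alone should stop the lookup.
function shortCircuitsOnCatalog(matches: CategorizedMatch[], category?: string): boolean {
  if (category !== undefined) {
    // Caller narrowed the request: any in-category hit, even a weak one,
    // keeps the pre-PR short-circuit. (Assumption: with no in-category
    // hit, the lookup falls through.)
    return matches.some((match) => match.category === category);
  }
  // No category filter: only strong hits short-circuit.
  return matches.some((match) => match.score >= STRONG_MATCH_THRESHOLD);
}

const weakOnly: CategorizedMatch[] = [{ category: "contract", score: 54 }];
console.log(shortCircuitsOnCatalog(weakOnly, "contract"), shortCircuitsOnCatalog(weakOnly));
```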

250 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex round-2 finding: the weak-only catalog note `_These are
word-overlap fuzzy matches, not direct hits — the documentation
results above are likely more authoritative._` rendered even when
there were no semantic results above (no client, version mismatch,
backend failed, semantic returned empty). Stale copy could mislead
a user into thinking they should look up at a documentation section
that doesn't exist.

Now the "documentation above" phrasing is gated on
`renderSemanticFirst` (which already encodes "semantic returned
hits AND we reordered"). The other weak-only paths use neutral
copy: `_Treat as low-confidence cues only._`
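
A minimal sketch of the gated copy, assuming a simplified formatter: `WeakOnlyView`, `renderWeakOnly`, the `## Documentation` heading, and the shortened "above" note are illustrative stand-ins. Only `renderSemanticFirst`, the `## Lower-Confidence Catalog Hints` heading, and the neutral note are taken from the commit messages.

```typescript
// Assumed input shape for a result with weak catalog hints only.
interface WeakOnlyView {
  docs: string[]; // semantic documentation snippets (may be empty)
  weakHints: string[]; // names of low-confidence catalog entries
}

function renderWeakOnly(view: WeakOnlyView): string {
  // Only claim "documentation above" when documentation was actually rendered.
  const renderSemanticFirst = view.docs.length > 0;
  const sections: string[] = [];

  if (renderSemanticFirst) {
    sections.push(["## Documentation", ...view.docs].join("\n"));
  }

  const note = renderSemanticFirst
    ? "_The documentation results above are likely more authoritative._" // shortened stand-in
    : "_Treat as low-confidence cues only._";
  const hintLines = view.weakHints.map((hint) => `- ${hint}`);
  sections.push(["## Lower-Confidence Catalog Hints", note, ...hintLines].join("\n"));

  return sections.join("\n\n");
}

const withDocs = renderWeakOnly({
  docs: ["Note lifecycle and nullification"],
  weakHints: ["Contract already initialized"],
});
const withoutDocs = renderWeakOnly({ docs: [], weakHints: ["Contract already initialized"] });
console.log(withDocs);
console.log(withoutDocs);
```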

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@critesjosh critesjosh force-pushed the feat/lookup-error-score-threshold branch from 38ca409 to 23d47f2 Compare May 2, 2026 01:21
@critesjosh critesjosh merged commit 8f00f0c into main May 2, 2026
6 checks passed
@critesjosh critesjosh deleted the feat/lookup-error-score-threshold branch May 2, 2026 01:25
@github-actions

github-actions Bot commented May 2, 2026

🎉 This PR is included in version 1.21.0 🎉

